AITopics | retinal image

Collaborating Authors

retinal image

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient Automated Diagnosis of Retinopathy of Prematurity by Customize CNN Models

Saeedi, Farzan, Keshvari, Sanaz, Shoeibi, Nasser

arXiv.org Artificial IntelligenceNov-14-2025

This paper encompasses an in-depth examination of Retinopathy of Prematurity (ROP) diagnosis, employing advanced deep learning methodologies. Our focus centers on refining and evaluating CNN-based approaches for precise and efficient ROP detection. We navigate the complexities of dataset curation, preprocessing strategies, and model architecture, aligning with research objectives encompassing model effectiveness, computational cost analysis, and time complexity assessment. Results underscore the supremacy of tailored CNN models over pre-trained counterparts, evident in heightened accuracy and F1-scores. Implementation of a voting system further enhances performance. Additionally, our study reveals the potential of the proposed customized CNN model to alleviate computational burdens associated with deep neural networks. Furthermore, we showcase the feasibility of deploying these models within dedicated software and hardware configurations, highlighting their utility as valuable diagnostic aids in clinical settings. In summary, our discourse significantly contributes to ROP diagnosis, unveiling the efficacy of deep learning models in enhancing diagnostic precision and efficiency.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.10023

Country: Asia > Middle East > Oman (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DeepEyeNet: Generating Medical Report for Retinal Images

Huang, Jia-Hong

arXiv.org Artificial IntelligenceSep-17-2025

The increasing prevalence of retinal diseases poses a significant challenge to the healthcare system, as the demand for ophthalmologists surpasses the available workforce. This imbalance creates a bottleneck in diagnosis and treatment, potentially delaying critical care. Traditional methods of generating medical reports from retinal images rely on manual interpretation, which is time-consuming and prone to errors, further straining ophthalmologists' limited resources. This thesis investigates the potential of Artificial Intelligence (AI) to automate medical report generation for retinal images. AI can quickly analyze large volumes of image data, identifying subtle patterns essential for accurate diagnosis. By automating this process, AI systems can greatly enhance the efficiency of retinal disease diagnosis, reducing doctors' workloads and enabling them to focus on more complex cases. The proposed AI-based methods address key challenges in automated report generation: (1) A multi-modal deep learning approach captures interactions between textual keywords and retinal images, resulting in more comprehensive medical reports; (2) Improved methods for medical keyword representation enhance the system's ability to capture nuances in medical terminology; (3) Strategies to overcome RNN-based models' limitations, particularly in capturing long-range dependencies within medical descriptions; (4) Techniques to enhance the interpretability of the AI-based report generation system, fostering trust and acceptance in clinical practice. These methods are rigorously evaluated using various metrics and achieve state-of-the-art performance. This thesis demonstrates AI's potential to revolutionize retinal disease diagnosis by automating medical report generation, ultimately improving clinical efficiency, diagnostic accuracy, and patient care.

jia-hong huang, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.12534

Country: Oceania > Australia (0.29)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DRetNet: A Novel Deep Learning Framework for Diabetic Retinopathy Diagnosis

Okuwobi, Idowu Paul, Liu, Jingyuan, Wan, Jifeng, Jiang, Jiaojiao

arXiv.org Artificial IntelligenceSep-3-2025

Diabetic retinopathy (DR) is a leading cause of blindness worldwide, necessitating early detection to prevent vision loss. Current automated DR detection systems often struggle with poor-quality images, lack interpretability, and insufficient integration of domain-specific knowledge. To address these challenges, we introduce a novel framework that integrates three innovative contributions: (1) Adaptive Retinal Image Enhancement Using Physics-Informed Neural Networks (PINNs): this technique dynamically enhances retinal images by incorporating physical constraints, improving the visibility of critical features such as microaneurysms, hemorrhages, and exudates; (2) Hybrid Feature Fusion Network (HFFN): by combining deep learning embeddings with handcrafted features, HFFN leverages both learned representations and domain-specific knowledge to enhance generalization and accuracy; (3) Multi-Stage Classifier with Uncertainty Quantification: this method breaks down the classification process into logical stages, providing interpretable predictions and confidence scores, thereby improving clinical trust. The proposed framework achieves an accuracy of 92.7%, a precision of 92.5%, a recall of 92.6%, an F1-score of 92.5%, an AUC of 97.8%, a mAP of 0.96, and an MCC of 0.85. Ophthalmologists rated the framework's predictions as highly clinically relevant (4.8/5), highlighting its alignment with real-world diagnostic needs. Qualitative analyses, including Grad-CAM visualizations and uncertainty heatmaps, further enhance the interpretability and trustworthiness of the system. The framework demonstrates robust performance across diverse conditions, including low-quality images, noisy data, and unseen datasets. These features make the proposed framework a promising tool for clinical adoption, enabling more accurate and reliable DR detection in resource-limited settings.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2509.01072

Country: Asia > China (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Explainable Transformer Model for Alzheimer's Disease Detection Using Retinal Imaging

Jamshidiha, Saeed, Rezaee, Alireza, Hajati, Farshid, Golzan, Mojtaba, Chiong, Raymond

arXiv.org Artificial IntelligenceJul-8-2025

Alzheimer's disease (AD) is a neurodegenerative disorder that affects millions worldwide. In the absence of effective treatment options, early diagnosis is crucial for initiating management strategies to delay disease onset and slow down its progression. In this study, we propose Retformer, a novel transformer-based architecture for detecting AD using retinal imaging modalities, leveraging the power of transformers and explainable artificial intelligence. The Retformer model is trained on datasets of different modalities of retinal images from patients with AD and age-matched healthy controls, enabling it to learn complex patterns and relationships between image features and disease diagnosis. To provide insights into the decision-making process of our model, we employ the Gradient-weighted Class Activation Mapping algorithm to visualize the feature importance maps, highlighting the regions of the retinal images that contribute most significantly to the classification outcome. These findings are compared to existing clinical studies on detecting AD using retinal biomarkers, allowing us to identify the most important features for AD detection in each imaging modality. The Retformer model outperforms a variety of benchmark algorithms across different performance metrics by margins of up to 11\.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.04259

Country:

Asia > Middle East > Iran (0.04)
Oceania > Australia > New South Wales > Callaghan (0.04)
Europe > Switzerland (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Integrating Non-Linear Radon Transformation for Diabetic Retinopathy Grading

Mohsen, Farida, Belhaouari, Samir, Shah, Zubair

arXiv.org Artificial IntelligenceApr-23-2025

Diabetic retinopathy is a serious ocular complication that poses a significant threat to patients' vision and overall health. Early detection and accurate grading are essential to prevent vision loss. Current automatic grading methods rely heavily on deep learning applied to retinal fundus images, but the complex, irregular patterns of lesions in these images, which vary in shape and distribution, make it difficult to capture subtle changes. This study introduces RadFuse, a multi-representation deep learning framework that integrates non-linear RadEx-transformed sinogram images with traditional fundus images to enhance diabetic retinopathy detection and grading. Our RadEx transformation, an optimized non-linear extension of the Radon transform, generates sinogram representations to capture complex retinal lesion patterns. By leveraging both spatial and transformed domain information, RadFuse enriches the feature set available to deep learning models, improving the differentiation of severity levels. We conducted extensive experiments on two benchmark datasets, APTOS-2019 and DDR, using three convolutional neural networks (CNNs): ResNeXt-50, MobileNetV2, and VGG19. RadFuse showed significant improvements over fundus-image-only models across all three CNN architectures and outperformed state-of-the-art methods on both datasets. For severity grading across five stages, RadFuse achieved a quadratic weighted kappa of 93.24%, an accuracy of 87.07%, and an F1-score of 87.17%. In binary classification between healthy and diabetic retinopathy cases, the method reached an accuracy of 99.09%, precision of 98.58%, and recall of 99.6%, surpassing previously established models. These results demonstrate RadFuse's capacity to capture complex non-linear features, advancing diabetic retinopathy classification and promoting the integration of advanced mathematical transforms in medical image analysis.

artificial intelligence, machine learning, radex, (18 more...)

arXiv.org Artificial Intelligence

2504.15883

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

RetinalGPT: A Retinal Clinical Preference Conversational Assistant Powered by Large Vision-Language Models

Zhu, Wenhui, Li, Xin, Chen, Xiwen, Qiu, Peijie, Vasa, Vamsi Krishna, Dong, Xuanzhao, Chen, Yanxi, Lepore, Natasha, Dumitrascu, Oana, Su, Yi, Wang, Yalin

arXiv.org Artificial IntelligenceMar-5-2025

Recently, Multimodal Large Language Models (MLLMs) have gained significant attention for their remarkable ability to process and analyze non-textual data, such as images, videos, and audio. Notably, several adaptations of general-domain MLLMs to the medical field have been explored, including LLaVA-Med. However, these medical adaptations remain insufficiently advanced in understanding and interpreting retinal images. In contrast, medical experts emphasize the importance of quantitative analyses for disease detection and interpretation. This underscores a gap between general-domain and medical-domain MLLMs: while general-domain MLLMs excel in broad applications, they lack the specialized knowledge necessary for precise diagnostic and interpretative tasks in the medical field. To address these challenges, we introduce \textit{RetinalGPT}, a multimodal conversational assistant for clinically preferred quantitative analysis of retinal images. Specifically, we achieve this by compiling a large retinal image dataset, developing a novel data pipeline, and employing customized visual instruction tuning to enhance both retinal analysis and enrich medical knowledge. In particular, RetinalGPT outperforms MLLM in the generic domain by a large margin in the diagnosis of retinal diseases in 8 benchmark retinal datasets. Beyond disease diagnosis, RetinalGPT features quantitative analyses and lesion localization, representing a pioneering step in leveraging LLMs for an interpretable and end-to-end clinical research framework. The code is available at https://github.com/Retinal-Research/RetinalGPT

dataset, knowledge, retinalgpt, (15 more...)

arXiv.org Artificial Intelligence

2503.03987

Country:

North America > United States > California (0.14)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
North America > United States > Arizona (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Object Detection for Medical Image Analysis: Insights from the RT-DETR Model

He, Weijie, Zhang, Yuwei, Xu, Ting, An, Tai, Liang, Yingbin, Zhang, Bo

arXiv.org Artificial IntelligenceJan-27-2025

Deep learning has emerged as a transformative approach for solving complex pattern recognition and object detection challenges. This paper focuses on the application of a novel detection framework based on the RT-DETR model for analyzing intricate image data, particularly in areas such as diabetic retinopathy detection. Diabetic retinopathy, a leading cause of vision loss globally, requires accurate and efficient image analysis to identify early-stage lesions. The proposed RT-DETR model, built on a Transformer-based architecture, excels at processing high-dimensional and complex visual data with enhanced robustness and accuracy. Comparative evaluations with models such as YOLOv5, YOLOv8, SSD, and DETR demonstrate that RT-DETR achieves superior performance across precision, recall, mAP50, and mAP50-95 metrics, particularly in detecting small-scale objects and densely packed targets. This study underscores the potential of Transformer-based models like RT-DETR for advancing object detection tasks, offering promising applications in medical imaging and beyond.

artificial intelligence, detection, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.16469

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Contrastive Learning for Retinal Imaging via Adjusted Augmentation Scales

Cheng, Zijie, Li, Boxuan, Altmann, André, Keane, Pearse A, Zhou, Yukun

arXiv.org Artificial IntelligenceJan-5-2025

Contrastive learning, a prominent approach within self-supervised learning, has demonstrated significant effectiveness in developing generalizable models for various applications involving natural images. However, recent research indicates that these successes do not necessarily extend to the medical imaging domain. In this paper, we investigate the reasons for this suboptimal performance and hypothesize that the dense distribution of medical images poses challenges to the pretext tasks in contrastive learning, particularly in constructing positive and negative pairs. We explore model performance under different augmentation strategies and compare the results to those achieved with strong augmentations. Our study includes six publicly available datasets covering multiple clinically relevant tasks. We further assess the model's generalizability through external evaluations. The model pre-trained with weak augmentation outperforms those with strong augmentation, improving AUROC from 0.838 to 0.848 and AUPR from 0.523 to 0.597 on MESSIDOR2, and showing similar enhancements across other datasets. Our findings suggest that optimizing the scale of augmentation is critical for enhancing the efficacy of contrastive learning in medical imaging.

artificial intelligence, contrastive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.02451

Country:

Europe (0.29)
North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Diabetic Retinopathy Detection Using CNN with Residual Block with DCGAN

Aronno, Debjany Ghosh, Saeha, Sumaiya

arXiv.org Artificial IntelligenceJan-4-2025

Diabetic Retinopathy (DR) is a major cause of blindness worldwide, caused by damage to the blood vessels in the retina due to diabetes. Early detection and classification of DR are crucial for timely intervention and preventing vision loss. This work proposes an automated system for DR detection using Convolutional Neural Networks (CNNs) with a residual block architecture, which enhances feature extraction and model performance. To further improve the model's robustness, we incorporate advanced data augmentation techniques, specifically leveraging a Deep Convolutional Generative Adversarial Network (DCGAN) for generating diverse retinal images. This approach increases the variability of training data, making the model more generalizable and capable of handling real-world variations in retinal images. The system is designed to classify retinal images into five distinct categories, from No DR to Proliferative DR, providing an efficient and scalable solution for early diagnosis and monitoring of DR progression. The proposed model aims to support healthcare professionals in large-scale DR screening, especially in resource-constrained settings.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.023

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GCS-M3VLT: Guided Context Self-Attention based Multi-modal Medical Vision Language Transformer for Retinal Image Captioning

Cherukuri, Teja Krishna, Shaik, Nagur Shareef, Bodapati, Jyostna Devi, Ye, Dong Hye

arXiv.org Artificial IntelligenceDec-22-2024

Retinal image analysis is crucial for diagnosing and treating eye diseases, yet generating accurate medical reports from images remains challenging due to variability in image quality and pathology, especially with limited labeled data. Previous Transformer-based models struggled to integrate visual and textual information under limited supervision. In response, we propose a novel vision-language model for retinal image captioning that combines visual and textual features through a guided context self-attention mechanism. This approach captures both intricate details and the global clinical context, even in data-scarce scenarios. Extensive experiments on the DeepEyeNet dataset demonstrate a 0.023 BLEU@4 improvement, along with significant qualitative advancements, highlighting the effectiveness of our model in generating comprehensive medical captions.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.17251

Country:

North America > United States (0.14)
Asia > India (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback